Network robustness and fragility: Percolation on random graphs

Recent work on the internet, social networks, and the power grid has addressed the resilience of
these networks to either random or targeted deletion of network nodes. Such deletions include, for 
example, the failure of internet routers or power transmission lines. Percolation models on random 
graphs provide a simple representation of this process, but have typically been limited to graphs 
with Poisson degree distribution at their vertices. Such graphs are quite unlike real world networks, 
which often possess power-law or other highly skewed degree distributions. In this paper we study 
percolation on graphs with completely general degree distribution, giving exact solutions for a variety 
of cases, including site percolation, bond percolation, and models in which occupation probabilities 
depend on vertex degree. We discuss the application of our theory to the understanding of network 
resilience. 



The internet, airline routes, and electric power grids are all examples of networks whose function
relies crucially on the pattern of interconnection between the components of the system. An
important property of such connection patterns is their robustness, or lack thereof, to removal of
network nodes, which can be modeled as a percolation process on a graph representing the network.
Vertices on the graph are considered occupied or not, depending on whether the network nodes they
represent (routers, airports, power stations) are functioning normally. Occupation probabilities
for different vertices may be uniform, or may depend on, for example, the number of connections
they have to other vertices, also called the vertex degree. Then we observe the properties of
percolation clusters on the graph, particularly their connectivity, as the function determining
occupation probability is varied. Previous results on models of this type suggest that, if the
connection patterns are chosen appropriately, the network can be made highly resilient to random
deletion of nodes, although it may be susceptible to an "attack" which specifically targets nodes
of high degree. We can also consider bond percolation on graphs as a model of robustness of
networks to failure of the links between nodes (e.g., fiber optic lines, power transmission cables,
and so forth), or combined site and bond percolation as a model of robustness against failure of
either nodes or links.

Percolation models built on networks have also been used to model the spread of disease through
communities. In such models a node in the network represents a potential host for the disease, and
is occupied if that host is susceptible to the disease. Links between nodes represent contacts
capable of transmitting the disease between individuals and may be occupied with some prescribed
probability to represent the fraction of such contacts which actually result in transmission. A
percolation transition in such a model represents the onset of an epidemic. Similar models can be
used to represent the propagation of computer viruses.

The simplest and most widely studied model of undirected networks is the random graph, which has
been investigated in depth for several decades now. However, random graphs suffer from (at least)
one serious shortcoming. As pointed out by a number of authors, vertex degrees have a Poisson
distribution in a random graph, but real-life degree distributions are strongly non-Poisson, often
taking power-law, truncated power-law, or exponential forms. This has prompted researchers to
study the properties of generalized random graphs which have non-Poisson degree distributions.

In this paper we employ the generating function formalism of Newman et al. to find exact analytic
solutions for site percolation on random graphs with any probability distribution of vertex degree,
where occupation probability is an arbitrary function of vertex degree. For the special case of
constant occupation probability, we also give solutions for bond and joint site/bond percolation.
Our results indicate how robust networks should be to random deletion of vertices or edges, or to
the preferential deletion of vertices with particular degree.

We start by examining site percolation for the general case in which occupation probability is an
arbitrary function of vertex degree. Let $p_k$ be the probability that a randomly chosen vertex has
degree $k$, and $q_k$ be the probability that a vertex is occupied given that it has degree $k$.
Then $p_k q_k$ is the probability of having degree $k$ and being occupied, and

$$F_0(x) = \sum_{k=0}^{\infty} p_k q_k x^k \tag{1}$$

is the probability generating function for this distribution. (Generating functions of this form
were previously used by Watts [16] to study cascading failures in networks.) Note that
$F_0(1) = q$, where $q$ is the overall fraction of occupied sites. If we wish to study the special
case of uniform occupation probability (ordinary site percolation), we simply set $q_k = q$ for
all $k$.

If we follow a randomly chosen edge, the vertex we reach has degree distribution proportional to
$k p_k$ rather than just $p_k$, because a randomly chosen edge is more likely to lead to a vertex
of higher degree. Hence the equivalent of (1) for such a vertex is

$$F_1(x) = \frac{\sum_k k p_k q_k x^{k-1}}{\sum_k k p_k} = \frac{F_0'(x)}{z}, \tag{2}$$

where $z$ is the average vertex degree.

Now let $H_1(x)$ be the generating function for the probability that one end of a randomly chosen
edge on the graph leads to a percolation cluster of a given number of occupied vertices. The
cluster may contain zero vertices if the vertex at the end of the edge in question is unoccupied,
which happens with probability $1 - F_1(1)$, or the edge may lead to an occupied vertex with a
number $k$ of other edges leading out of it, distributed according to $F_1(x)$. This means that
$H_1(x)$ satisfies a self-consistency condition of the form

$$H_1(x) = 1 - F_1(1) + x F_1(H_1(x)). \tag{3}$$

The probability distribution for the size of the cluster to which a randomly chosen vertex belongs
is similarly generated by $H_0(x)$, where

$$H_0(x) = 1 - F_0(1) + x F_0(H_1(x)). \tag{4}$$

Together, Eqs. (1)-(4) determine the cluster size distribution for site percolation on a graph of
arbitrary degree distribution. From these equations we can determine several quantities of
interest such as mean cluster size, position of the percolation threshold, and giant component
size, as demonstrated below.
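As a rough numerical illustration of Eqs. (1)-(4), the following Python sketch evaluates $F_0$, $F_1$, $H_1$, and $H_0$ at a single point $x$. The function names, the list representation of $p_k$ and $q_k$ (indexed by degree), and the choice of fixed-point iteration started from zero are assumptions made here for illustration; the text does not prescribe a numerical scheme at this point.

```python
def F0(x, p, q):
    """Eq. (1): sum_k p_k q_k x^k, with p[k] and q[k] indexed by degree k."""
    return sum(pk * qk * x ** k for k, (pk, qk) in enumerate(zip(p, q)))


def F1(x, p, q):
    """Eq. (2): F_0'(x) / z, where z = sum_k k p_k is the mean degree."""
    z = sum(k * pk for k, pk in enumerate(p))
    total = sum(k * pk * qk * x ** (k - 1)
                for k, (pk, qk) in enumerate(zip(p, q)) if k > 0)
    return total / z


def H1(x, p, q, tol=1e-12, max_iter=100000):
    """Solve Eq. (3) for H_1(x), 0 <= x <= 1, by fixed-point iteration.

    Starting from h = 0 the iterates increase monotonically towards the
    smallest non-negative solution (an illustrative choice of start value).
    """
    h = 0.0
    const = 1.0 - F1(1.0, p, q)
    for _ in range(max_iter):
        h_new = const + x * F1(h, p, q)
        if abs(h_new - h) < tol:
            return h_new
        h = h_new
    return h


def H0(x, p, q):
    """Eq. (4): generating function for the cluster of a random vertex."""
    return 1.0 - F0(1.0, p, q) + x * F0(H1(x, p, q), p, q)


if __name__ == "__main__":
    # Hypothetical example: every vertex has degree 3, uniform occupation 0.8.
    p = [0.0, 0.0, 0.0, 1.0]
    q = [0.8] * 4
    print(H1(1.0, p, q), H0(1.0, p, q))
```

For this toy degree distribution Eq. (3) at $x = 1$ reduces to a quadratic, $h = 0.2 + 0.8 h^2$, whose smaller root $h = 1/4$ can be checked by hand against the iteration.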

For the special case of uniform (degree-independent) site occupation probability, $q_k = q$ for
all $k$, Eqs. (3) and (4) simplify to

$$H_1(x) = 1 - q + q x G_1(H_1(x)), \tag{5}$$
$$H_0(x) = 1 - q + q x G_0(H_1(x)), \tag{6}$$

where $G_0(x) = \sum_k p_k x^k$ and $G_1(x) = G_0'(x)/z$ are the generating functions for vertex
degree alone introduced in Ref. [14]. For bond percolation with uniform occupation probability,
we find that

$$H_0(x) = x G_0(H_1(x)), \tag{7}$$

with $H_1(x)$ given by Eq. (5) again, and for joint site/bond percolation with uniform site and
bond occupation probabilities $q_s$ and $q_b$, we have

$$H_1(x) = 1 - q_s q_b + q_s q_b x G_1(H_1(x)), \tag{8}$$
$$H_0(x) = 1 - q_s + q_s x G_0(H_1(x)), \tag{9}$$

and indeed Eqs. (5)-(7) may be considered special cases of these last two equations when either
$q_s$ or $q_b$ is 1.

We now apply these results to the study of network robustness in a variety of cases. First, we
consider the case of uniform site occupation probability embodied in Eqs. (5) and (6), which
corresponds to random removal of nodes from a network, for example through failure of routers in a
data network, or through random vaccination of a population against a disease.

Typically, no closed-form solution exists for Eq. (5), but it is possible to determine the terms
of $H_1(x)$ to any finite order $n$ by iterating Eq. (5) $n + 1$ times starting from an initial
value of $H_1 = 1$. The probability distribution of cluster sizes can then be calculated exactly
by substituting into Eq. (6) and expanding about $x = 0$. To test this method, we have performed
simulations [19] of site percolation on random graphs with vertex degrees distributed according to
the truncated power law



$$p_k = \begin{cases} 0 & \text{for } k = 0, \\ C k^{-\tau} e^{-k/\kappa} & \text{for } k \ge 1, \end{cases} \tag{10}$$

where $C$ is a normalization constant, $\tau$ is the power-law exponent, and $\kappa$ is the
cutoff parameter.

Our reasons for choosing this distribution are two-fold. First, it is seen in a number of
real-world social networks including collaboration networks of movie actors and scientists. The
pure power-law distributions seen in internet data [10] are also included in (10) as the special
case $\kappa \to \infty$. Second, the distribution has technical advantages over a pure power-law
form because the exponential cutoff regularizes the calculations, so that the generating functions
and their derivatives are finite. For pure power-law forms, on the other hand, the calculations
diverge, indicating that real-world networks cannot take a pure power-law form and must have some
cutoff (presumably dependent on the system size).
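The series iteration of Eq. (5) described before Eq. (10) can be sketched with truncated polynomials. The Python below is only an illustration under stated assumptions: the degree distribution is cut off at a hypothetical maximum degree `kmax` (chosen large enough that $e^{-k/\kappa}$ is negligible), and the helper names are invented here. It returns the coefficients $P_s$ of $H_0(x)$ up to order $n$.

```python
import math


def truncated_power_law(tau, kappa, kmax):
    """p_k of Eq. (10), normalized over k = 1..kmax (kmax is an assumption)."""
    w = [0.0] + [k ** -tau * math.exp(-k / kappa) for k in range(1, kmax + 1)]
    norm = sum(w)
    return [x / norm for x in w]


def mul_trunc(a, b, n):
    """Product of two polynomials (coefficient lists), keeping orders 0..n."""
    out = [0.0] * (n + 1)
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue
        for j, bj in enumerate(b):
            if i + j > n:
                break
            out[i + j] += ai * bj
    return out


def compose(coeffs, h, n):
    """Evaluate sum_k coeffs[k] * h(x)^k to order n by Horner's rule."""
    res = [0.0] * (n + 1)
    for c in reversed(coeffs):
        res = mul_trunc(res, h, n)
        res[0] += c
    return res


def cluster_size_distribution(p, q, n):
    """Coefficients P_0..P_n of H_0(x) for uniform occupation probability q."""
    z = sum(k * pk for k, pk in enumerate(p))
    g1 = [k * p[k] / z for k in range(1, len(p))]   # coefficients of G_1(x)
    h1 = [1.0] + [0.0] * n                          # initial value H_1 = 1
    for _ in range(n + 1):                          # iterate Eq. (5) n+1 times
        inner = compose(g1, h1, n)
        h1 = [1.0 - q] + [q * c for c in inner[:n]]
    inner = compose(p, h1, n)                       # substitute into Eq. (6)
    return [1.0 - q] + [q * c for c in inner[:n]]


if __name__ == "__main__":
    p = truncated_power_law(tau=2.5, kappa=10.0, kmax=100)
    for s, prob in enumerate(cluster_size_distribution(p, q=0.65, n=10)):
        print(s, prob)
```

Each pass through Eq. (5) fixes one more order of the series, which is why $n + 1$ iterations suffice for the coefficients up to order $n$. Here $P_0 = 1 - q$ is the probability that the chosen vertex is unoccupied.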

Figure 1 shows the cluster size distribution from our simulations along with the exact solution
for the same parameter values from the generating function formalism. The agreement between the
two is good.
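The simulation procedure itself is not described in the text, so the following is only a guess at a comparable experiment rather than the authors' method. It assumes a random graph built by random stub matching (which permits occasional self-loops and multi-edges), independent site occupation with probability $q$, and a union-find structure to identify clusters. All names are hypothetical.

```python
import random
from collections import Counter


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def simulate_site_percolation(p, q, n, seed=0):
    """Estimate P_s: probability a random vertex lies in a cluster of s sites.

    p[k] is the degree distribution, q the uniform occupation probability,
    n the number of vertices. P_0 counts unoccupied vertices.
    """
    rng = random.Random(seed)
    degs = rng.choices(range(len(p)), weights=p, k=n)
    if sum(degs) % 2:            # stubs must pair up, so force an even total
        degs[0] += 1
    stubs = [v for v, d in enumerate(degs) for _ in range(d)]
    rng.shuffle(stubs)           # random stub matching (assumed construction)
    occupied = [rng.random() < q for _ in range(n)]

    parent = list(range(n))
    for a, b in zip(stubs[0::2], stubs[1::2]):
        if occupied[a] and occupied[b]:
            ra, rb = find(parent, a), find(parent, b)
            if ra != rb:
                parent[ra] = rb

    sizes = Counter(find(parent, v) for v in range(n) if occupied[v])
    dist = Counter()
    for s in sizes.values():
        dist[s] += s             # a cluster of size s contains s vertices
    dist[0] = n - sum(occupied)
    return {s: c / n for s, c in dist.items()}


if __name__ == "__main__":
    # Hypothetical small degree distribution, for illustration only.
    result = simulate_site_percolation([0.0, 0.5, 0.3, 0.2], q=0.65, n=20000)
    for s in sorted(result)[:6]:
        print(s, result[s])
```

A run of this kind gives one noisy estimate of $P_s$. Averaging over several seeds would be the natural way to compare against the series coefficients of $H_0(x)$.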

The sizes of the clusters correspond, for instance, to the sizes of outbreaks of a disease among
groups of susceptible individuals. The parameter values used in Fig. 1 are below the percolation
threshold for this particular degree distribution, and hence all outbreaks are small and there is
no epidemic behavior. The mean cluster size is



$$\langle s \rangle = H_0'(1) = q + q G_0'(1) H_1'(1) = q \left[ 1 + \frac{q G_0'(1)}{1 - q G_1'(1)} \right], \tag{11}$$

which diverges when $1 - q G_1'(1) = 0$. This point marks the percolation threshold of the system,
the point at which a giant component of connected vertices first forms. Thus the critical
occupation probability is

$$q_c = \frac{1}{G_1'(1)}. \tag{12}$$

A result equivalent to this one has been derived previously by Cohen et al. [2] by different means.
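Since $G_1'(1) = G_0''(1)/z$, the threshold of Eq. (12) can be written as $q_c = \sum_k k p_k / \sum_k k(k-1) p_k$ and evaluated directly from a degree distribution. The short Python sketch below does this. The function names and the finite maximum degree `kmax` used to truncate Eq. (10) are assumptions made for illustration.

```python
import math


def critical_occupation(p):
    """q_c = 1 / G_1'(1) = sum_k k p_k / sum_k k (k - 1) p_k, Eq. (12)."""
    first = sum(k * pk for k, pk in enumerate(p))
    second = sum(k * (k - 1) * pk for k, pk in enumerate(p))
    if second == 0.0:
        return math.inf          # no vertex has degree >= 2: no transition
    return first / second


def truncated_power_law(tau, kappa, kmax):
    """p_k of Eq. (10), normalized over k = 1..kmax (kmax is an assumption)."""
    w = [0.0] + [k ** -tau * math.exp(-k / kappa) for k in range(1, kmax + 1)]
    norm = sum(w)
    return [x / norm for x in w]


if __name__ == "__main__":
    # Sanity check: if every vertex has degree 3 then G_1'(1) = 2, q_c = 1/2.
    print(critical_occupation([0.0, 0.0, 0.0, 1.0]))
    # Truncated power law; kmax = 5000 makes the neglected tail negligible.
    print(critical_occupation(truncated_power_law(2.5, 100.0, 5000)))
```

Values of $q_c$ above 1 simply mean that no giant component forms even when every vertex is occupied.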

In the language of disease propagation, $q_c$ is the point at which an epidemic of the disease
first occurs. In the language of network robustness, it is the point at which the network achieves
large-scale connectivity, and can therefore function as an effective distribution network.
Conversely, if we are approaching the transition from values of $q$ above $q_c$, it is the point
at which a sufficient number of individuals are immune to a disease to prevent it from spreading,
or the point at which a large enough number of nodes have been deleted from a distribution network
to prevent distribution on large scales.

The inset of Fig. 1 shows the behavior of the percolation threshold with the cutoff parameter
$\kappa$ for a variety of values of $\tau$. Note that as the values of $\kappa$ become large, the
percolation threshold becomes small, indicating a high degree of robustness of the network to
random deletion of nodes. For $\tau = 2.5$ (roughly the exponent for the internet data) and
$\kappa = 100$, the percolation threshold is $q_c = 0.17$, indicating that one can remove more
than 80% of the nodes in the network without destroying the giant component; the network will
still possess large-scale connectivity. This result agrees with recent studies of the internet
which indicate that network connectivity should be highly robust against the random removal of
nodes.

Another issue that has attracted considerable recent attention is the question of robustness of a
network to non-random deletion targeted specifically at nodes with high degree. Albert et al. and
Broder et al. both looked at the connectivity of a network with power-law distributed vertex
degrees as the vertices with highest degree were progressively removed. In the language of our
percolation models, this is equivalent to setting

$$q_k = \theta(k_{\max} - k), \tag{13}$$



where $\theta$ is the Heaviside step function [21]. This removes (unoccupies) all vertices with
degree greater than $k_{\max}$. To investigate the effect of this removal, we calculate the size
of the giant component in the network, if there is one. Above the percolation transition the
generating function $H_0(x)$ gives the distribution of the sizes of clusters of vertices which are
not in the giant component [17], which means that $H_0(1)$ is equal to the fraction of the graph
which is not occupied by the giant component. The fraction $S$ which is occupied by the giant
component is therefore given by

$$S = 1 - H_0(1) = F_0(1) - F_0(u), \tag{14}$$

where $u$ is a solution of the self-consistency condition

$$u = 1 - F_1(1) + F_1(u). \tag{15}$$



In cases where this last equation is not exactly solvable we can evaluate $u$ by numerical
iteration starting from a suitable initial value. In Fig. 2 we show the results for $S$ from this
calculation for graphs with pure power-law degree distributions as a function of $k_{\max}$ for a
variety of values of $\tau$. (The removal of vertices with high degree regularizes the calculation
in a similar way to the inclusion of the cutoff $\kappa$ in our earlier calculation, so no other
cutoff is needed in this case.) On the same plot we also show simulation results for this problem,
and once more agreement of theory and simulation is good.

FIG. 1. Probability $P_s$ that a randomly chosen vertex belongs to a cluster of $s$ sites for
$\kappa = 10$, $\tau = 2.5$, and $q = 0.65$ from numerical simulation on systems of $10^7$ sites
(circles) and our exact solution (solid line). Inset: the percolation threshold $q_c$ from
Eq. (12) (solid lines), versus computer simulations with $\tau = 1.5$ (circles), 2.0 (squares),
and 2.5 (triangles).
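The numerical iteration for Eq. (15) mentioned above can be sketched as follows for a pure power-law degree distribution $p_k \propto k^{-\tau}$ ($k \ge 1$, $\tau > 2$) with all vertices of degree greater than $k_{\max}$ removed. This is an illustrative sketch, not the authors' code. The names are hypothetical, the zeta-function normalization uses a simple partial sum plus an integral estimate of the tail, and the iteration is started from $u = 0$ as one possible "suitable initial value".

```python
def zeta(s, terms=10000):
    """Riemann zeta for s > 1: partial sum plus an integral tail estimate."""
    head = sum(k ** -s for k in range(1, terms + 1))
    return head + (terms + 0.5) ** (1.0 - s) / (s - 1.0)


def giant_component(tau, kmax, tol=1e-12, max_iter=100000):
    """Giant component size S from Eqs. (14) and (15) with q_k = theta(kmax - k).

    Vertices with degree <= kmax are occupied; the mean degree z is taken
    over the full (unremoved) power-law distribution, so tau must exceed 2.
    """
    norm = zeta(tau)
    z = zeta(tau - 1.0) / norm                     # mean degree
    p = [0.0] + [k ** -tau / norm for k in range(1, kmax + 1)]

    def F0(x):
        return sum(pk * x ** k for k, pk in enumerate(p))

    def F1(x):
        return sum(k * p[k] * x ** (k - 1) for k in range(1, kmax + 1)) / z

    u = 0.0
    const = 1.0 - F1(1.0)
    for _ in range(max_iter):                      # fixed-point form of Eq. (15)
        u_new = const + F1(u)
        converged = abs(u_new - u) < tol
        u = u_new
        if converged:
            break
    return F0(1.0) - F0(u)                         # Eq. (14)


if __name__ == "__main__":
    for kmax in (5, 10, 20, 50, 100):
        print(kmax, giant_component(2.7, kmax))
```

When no giant component exists the iteration converges to the trivial solution $u = 1$ and the function returns a value that is zero to within the tolerance.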



Opinions appear to differ over whether networks such as this are robust or fragile to this
selective removal of vertices. Albert et al. point out that only a small fraction of the
highest-degree vertices need be removed to destroy the giant component in the network and hence
remove all long-range connectivity. Conversely, Broder et al. point out that one can remove all
vertices with degree greater than $k_{\max}$ and still have a giant component even for
surprisingly small values of $k_{\max}$. As we show in Fig. 2, both viewpoints are correct: they
are merely different representations of the same data. In the upper frame of the figure, we plot
giant component size as a function of the fraction of vertices removed from the network, and it is
clear that the giant component disappears when only a small percentage are removed (just 1% for
the case $\tau = 2.7$), so that the network appears fragile. In the lower frame we show the same
data as a function of $k_{\max}$, the highest remaining vertex degree, and we see that when viewed
in this way the network is, in a sense, robust, since $k_{\max}$ must be very small to destroy the
giant component completely (just 10 in the case of $\tau = 2.7$).
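The two representations discussed above are linked by the fraction of vertices whose degree exceeds $k_{\max}$, which for a pure power law $p_k \propto k^{-\tau}$ is $1 - \sum_{k \le k_{\max}} p_k$. A small Python sketch of this conversion follows. The function names and the approximate zeta-function normalization are assumptions made here for illustration.

```python
def zeta(s, terms=10000):
    """Riemann zeta for s > 1: partial sum plus an integral tail estimate."""
    head = sum(k ** -s for k in range(1, terms + 1))
    return head + (terms + 0.5) ** (1.0 - s) / (s - 1.0)


def fraction_removed(tau, kmax):
    """Fraction of vertices with degree > kmax when p_k = k^-tau / zeta(tau)."""
    kept = sum(k ** -tau for k in range(1, kmax + 1)) / zeta(tau)
    return 1.0 - kept


if __name__ == "__main__":
    for kmax in (5, 10, 20, 50):
        print(kmax, fraction_removed(2.7, kmax))
```

For $\tau = 2.7$ and $k_{\max} = 10$ the removed fraction comes out of order one percent, which is how a small removed fraction and a small cutoff degree can describe the same point on the two frames.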

FIG. 2. Size of the giant component $S$ in graphs with power-law degree distribution and all
vertices with degree greater than $k_{\max}$ unoccupied, for $\tau = 2.4$ (circles), 2.7
(squares), and 3.0 (triangles). Points are simulation results for systems with $10^7$ vertices,
solid lines are the exact solution. Upper frame: as a function of fraction of vertices unoccupied.
Lower frame: as a function of the cutoff parameter $k_{\max}$.

To conclude, we have used generating function methods to solve exactly for the behavior of a
variety of percolation models on random graphs with any distribution of vertex degrees, including
uniform site, bond and site/bond percolation, and percolation in which occupation probability is a
function of vertex degree. Percolation systems on graphs such as these have been suggested as
models for the robustness of communication or distribution networks to breakdown or sabotage, and
for the spread of disease through communities possessing some resistance to infection. Our exact
solutions allow us to make predictions about the behavior of such model systems under quite
general types of breakdown or interference. Among other results, we find that a distribution
network such as the internet, which has an approximately power-law vertex degree distribution,
should be highly robust against random removal of nodes (for example, random failure of routers),
but is relatively fragile, at least in terms of fraction of nodes removed, to the specific removal
of the most highly connected nodes.

[...] by the National Science Foundation, the Electric Power Research Institute, and the Army
Research Office.

[...] discussions of this point.

In fact, the approaches of Albert et al. and Broder et al. differ slightly. Broder et al. simply
removed the highest degree vertices from the network, whereas Albert et al. recalculated vertex
degrees after the removal of each vertex and its associated edges, and then removed the next
highest degree vertex. Our calculations are equivalent to the method of Broder et al., although in
practice there appears to be little difference in qualitative behavior between the two.